CMake and linking fixes for PyPI builds #174
The pip-build path globbed every .so in rdkit.libs/ into RDKit_LIBS and
every libboost_* into Boost_LIBRARIES, then linked all of them into every
nvmolkit Python module. That put libcairo and libquadmath on each
module's NEEDED list. libcairo in turn needs libXrender/libX11/libXext;
those are on the manylinux lib_whitelist, so rdkit-pypi's auditwheel pass
legitimately left them external, but the nvidia/cuda runtime container we
test in doesn't ship them, so `import nvmolkit.fingerprints` failed at load.

Narrow RDKit_LIBS to the 16 components nvmolkit actually uses (mirroring
the conda path's explicit list) and Boost_LIBRARIES to the same 4 the
conda path uses (boost serialization / iostreams / python<ver> /
numpy<ver>). Also run patchelf with --force-rpath so the entry-point
modules get DT_RPATH instead of DT_RUNPATH. The libs inside rdkit.libs/
have no rpath of their own and rely on RPATH inheritance to resolve
second-level deps; rdkit's own python bindings do the same. Drop the
unused ${RDKit_LIBS} link from _arrayHelpers, which touches zero RDKit
symbols.

Verified that the rdkit==2026.3.1 py3.12 wheel built with this change
loads and passes the full pytest suite (390 passed, 10 long deselected)
on an H200 in the manylinux+CUDA container with no system X/font libs.
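
For anyone who wants to check a built module for the two symptoms described above (stray NEEDED entries and DT_RUNPATH instead of DT_RPATH), here is a minimal sketch that parses the text printed by `readelf -d`. It is not part of this PR. The helper names, the banned-prefix list, and the sample output (including the library names and the rpath value) are illustrative assumptions, not output captured from an nvmolkit wheel.

```python
import re

# Matches readelf -d lines such as:
#   0x0000000000000001 (NEEDED)   Shared library: [libfoo.so.1]
#   0x000000000000000f (RPATH)    Library rpath: [$ORIGIN/../libs]
_DYN_ENTRY = re.compile(r"\((NEEDED|RPATH|RUNPATH)\)\s+[^\[]*\[([^\]]*)\]")


def parse_dynamic_section(readelf_output):
    """Collect NEEDED, RPATH and RUNPATH values from `readelf -d` text."""
    info = {"NEEDED": [], "RPATH": [], "RUNPATH": []}
    for line in readelf_output.splitlines():
        match = _DYN_ENTRY.search(line)
        if match:
            info[match.group(1)].append(match.group(2))
    return info


def unexpected_needed(info, banned_prefixes=("libcairo", "libquadmath", "libX")):
    """Return NEEDED entries whose names start with a banned prefix."""
    return [lib for lib in info["NEEDED"] if lib.startswith(banned_prefixes)]


# Hypothetical readelf -d excerpt, written by hand for illustration only.
SAMPLE = """\
 0x0000000000000001 (NEEDED)             Shared library: [libRDKitGraphMol.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libcairo.so.2]
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/../rdkit.libs]
"""

info = parse_dynamic_section(SAMPLE)
print("unexpected NEEDED:", unexpected_needed(info))
print("uses DT_RPATH:", bool(info["RPATH"]) and not info["RUNPATH"])
```

In practice the text would come from running `readelf -d` on each entry-point .so in the repaired wheel; an empty "unexpected NEEDED" list together with DT_RPATH present is the state this change aims for.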
| Filename | Overview |
|---|---|
| admin/distribute/repair_wheel.sh | Added --force-rpath to patchelf call so entry-point .so files carry DT_RPATH (inherited by transitive deps) instead of DT_RUNPATH (not inherited); comment expanded to explain the rationale. |
| cmake/boost.cmake | BOOST_TARGET_LIBS now shared between both build paths; pip path globs exactly one .so per named component (serialization/iostreams/python/numpy) instead of filtering the entire rdkit.libs glob, with FATAL_ERROR guards for unexpected match counts. |
| cmake/rdkit.cmake | Introduces NVMOLKIT_RDKIT_COMPONENTS list shared by both code paths; pip path now resolves exactly 16 named libRDKit*.so.* files instead of globbing every file in rdkit.libs/, with FATAL_ERROR guards for missing or ambiguous matches. |
| nvmolkit/CMakeLists.txt | Removes the unused ${RDKit_LIBS} private link from _arrayHelpers, which touches no RDKit symbols; all other module link lines are unchanged. |
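
The "exactly one match per named component" rule described for cmake/boost.cmake and cmake/rdkit.cmake can be sketched outside CMake as follows. This is an illustration of the guard logic only, not the CMake code in this PR: the function name, the glob pattern, and the file and component names in the demo are assumptions.

```python
import tempfile
from pathlib import Path


def resolve_components(lib_dir, components, prefix="libRDKit"):
    """Resolve each named component to exactly one shared library file.

    Raises RuntimeError when a component has zero matches or more than one,
    which plays the role of the FATAL_ERROR guards described in the table.
    """
    resolved = []
    for name in components:
        # Hypothetical pattern; the real CMake globs may differ.
        matches = sorted(Path(lib_dir).glob(f"{prefix}{name}*.so.*"))
        if len(matches) != 1:
            raise RuntimeError(
                f"expected exactly 1 library for {name!r}, found {len(matches)}"
            )
        resolved.append(matches[0])
    return resolved


# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for fname in (
        "libRDKitGraphMol-1a2b.so.1",
        "libRDKitSmilesParse-3c4d.so.1",
        "libcairo-5e6f.so.2",  # present on disk but never requested
    ):
        (Path(tmp) / fname).touch()

    libs = resolve_components(tmp, ["GraphMol", "SmilesParse"])
    print([p.name for p in libs])
```

Compared with a blanket glob, only the requested components end up in the result, so an unrelated library such as libcairo in the same directory is never linked.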
Reviews (2): Last reviewed commit: "Merge branch 'main' into pip-wheels-mini..."
evasnow1992
left a comment
Changes look good to me. Thanks!